Method for identifying differentially expressed proteins

ABSTRACT

The invention provides a method of identifying an antigen that is present in different amounts in two different samples. In general, the methods involve generating a first and second distinguishably labeled population of antibodies that are reactive to the two samples, contacting the first and second labeled populations of antibodies with a plurality of antigens, and identifying any resultant antigens that are differentially bound by the first and second populations of antibodies. The antigens may be on the surface of cells, e.g., animal cells, or on a solid support. Once identified, the nucleic acid encoding an antigen of interest may be identified and sequenced to reveal the identity of the antigen of interest. Kits for performing the methods are also provided. The methods find most use in medical and research applications, in particular, for identifying cell surface targets for immunotherapy and drug discovery.

CROSS-REFERENCING

This patent application claims priority to U.S. Provisional Patent Application No. 60/426,029, filed Nov. 12, 2002, which application is hereby incorporated by reference herein in its entirety for all purposes.

ACKNOWLEDGMENT OF GOVERNMENT SUPPORT

This invention was made with Government support under Grant No. DAMD17-03-C-0039 awarded by the U.S. Army. The Government has certain rights in this invention.

INTRODUCTION

1. Field of the Invention

The field of this invention is molecular biology. The invention relates to methods of identifying a protein that is differentially expressed between two samples, and isolating its encoding nucleic acid.

2. Background of the Invention

An enormous amount of effort is focused on understanding human diseases and conditions such as cancer, aging, abnormal development, cardiovascular disease and neurodegeneration. The genes that are abnormally expressed in these phenomena are of particular interest.

Several methods have been described for identifying mRNAs whose expression is associated with these phenomena. These methods include DNA microarrays (Schena et al., Science 270:467-70, 1995), RNA display, serial analysis of gene expression, subtraction hybridization, reciprocal subtraction differential RNA display, representational difference analysis, RNA fingerprinting by arbitrarily primed PCR, electronic subtraction, combinational matrix gene analysis, and signal sequence trapping (see U.S. Pat. No. 5,536,637).

However, these methods have several shortcomings.

One limitation is that changes in mRNA levels often do not correlate with changes in their encoded protein levels. As such, several “false positives” may be identified by the above methods.

Another limitation is that expression of a particular protein may be up-regulated without a concomitant increase in mRNA expression. Such an increase in gene expression would be undetectable using the above methods.

Another limitation is that a cell employs a wide variety of post-translational modifications to regulate the activity of a protein, and these modifications are undetectable using the above methods.

Furthermore, the preceding methods are not useful for identifying proteins that are localized in specific subcellular compartments such as the cell membrane, nuclei, cytoplasm and extracellular space.

In one attempt to overcome these limitations, Scherer et al. (Nat Biotechnol. 16:581-6, 1998) described a method that utilized immunodepleted rabbit polyserum to screen a bacteriophage cDNA library. This method involves raising polyclonal antibodies for a first cell type and immunodepleting the antibodies using a second cell type in order to isolate antibodies that are specific for the first cell type. While this method was successful for isolating a handful of cDNAs, methods using immunodepletion, in general, also have a number of serious limitations. Firstly, immunodepletion methods are tedious and most often involve more than one depletion procedure. Secondly, immunodepletion is never complete and the antibodies that are not removed by this procedure typically cause a large number of false positives. Thirdly, rare antibodies in the non-depleted antiserum may be depleted due to non-specific binding or carry-over by the depleting material.

As such, there is a need for improved methods for detecting and identifying gene products, especially proteins, that are abnormally expressed in human conditions and diseases. This invention meets these and other needs.

LITERATURE

The following literature may be of interest: Scherer et al. (Nat Biotechnol. 16:581-6, 1998), Zannettino et al. (J Immunol 156:611-20, 1996), Bickel et al. (Biotechnol Genet Eng Rev 17:417-30, 2000), Shusta et al. (Molecular and Cellular Proteomics 1.1 75-82, 2001) and U.S. Pat. No. 5,506,126.

SUMMARY OF THE INVENTION

The invention provides a method of identifying an antigen that is present in different amounts in two different samples. In general, the methods involve generating a first and second distinguishably labeled population of antibodies that are reactive to the two samples, contacting the first and second labeled populations of antibodies with a plurality of antigens, and identifying any resultant antigens that are differentially bound by the first and second populations of antibodies. The antigens may be on the surface of cells, e.g., animal cells, or on a solid support. Once identified, the nucleic acid encoding an antigen of interest may be identified and sequenced to reveal the identity of the antigen of interest. Kits for performing the methods are also provided. The methods find most use in medical and research applications, in particular, for identifying cell surface targets for immunotherapy and drug discovery.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an embodiment of the present invention.

FIG. 2 is a schematic diagram of another embodiment of the present invention.

FIG. 3 shows four panels of graphs, A-D, showing FACS analysis of HeLa cDNA library-infected 240E cells. Cells were stained using either anti-HeLa (PE) or anti-NHDF (FITC) antibodies.

FIG. 4 shows two panels of graphs showing FACS analysis of HeLa cDNA library-infected 240E cells, competitively bound by anti-HeLa (PE) and anti-NHDF (FITC) antibody probes.

FIG. 5 shows a DNA gel showing PCR amplification of HeLa genes from 3 of the 240E stable cell lines. 1 kb marker and clone numbers are indicated on the top. Clone 2 showed a specific PCR product.

FIG. 6 shows six panels of graphs showing FACS analysis of the 240E stable clones.

DEFINITIONS

The terms “antibody” and “immunoglobulin” are used interchangeably herein. These terms are well understood by those in the field, and refer to a protein consisting of one or more polypeptides that specifically binds an antigen. One form of antibody constitutes the basic structural unit of an antibody. This form is a tetramer and consists of two identical pairs of antibody chains, each pair having one light and one heavy chain. In each pair, the light and heavy chain variable regions are together responsible for binding to an antigen, and the constant regions are responsible for the antibody effector functions.

The recognized immunoglobulin polypeptides include the kappa and lambda light chains and the alpha, gamma (IgG₁, IgG₂, IgG₃, IgG₄), delta, epsilon and mu heavy chains or equivalents in other species. Full-length immunoglobulin “light chains” (of about 25 kDa or about 214 amino acids) comprise a variable region of about 110 amino acids at the NH₂-terminus and a kappa or lambda constant region at the COOH-terminus. Full-length immunoglobulin “heavy chains” (of about 50 kDa or about 446 amino acids) similarly comprise a variable region (of about 116 amino acids) and one of the aforementioned heavy chain constant regions, e.g., gamma (of about 330 amino acids).

The terms “antibodies” and “immunoglobulin” include antibodies or immunoglobulins of any isotype, fragments of antibodies which retain specific binding to antigen, including, but not limited to, Fab, Fv, scFv, and Fd fragments, chimeric antibodies, humanized antibodies, single-chain antibodies, and fusion proteins comprising an antigen-binding portion of an antibody and a non-antibody protein. The antibodies may be detectably labeled, e.g., with a radioisotope, an enzyme which generates a detectable product, a fluorescent protein, and the like. The antibodies may be further conjugated to other moieties, such as members of specific binding pairs, e.g., biotin (member of biotin-avidin specific binding pair), and the like. The antibodies may also be bound to a solid support, including, but not limited to, polystyrene plates or beads, and the like. Also encompassed by the terms are Fab′, Fv, F(ab′)₂, and other antibody fragments that retain specific binding to antigen.

Antibodies may exist in a variety of other forms including, for example, Fv, Fab, and F(ab′)₂, as well as bi-functional (i.e. bi-specific) hybrid antibodies (e.g., Lanzavecchia et al., Eur. J. Immunol. 17, 105 (1987)) and in single chains (e.g., Huston et al., Proc. Natl. Acad. Sci. U.S.A., 85, 5879-5883 (1988) and Bird et al., Science, 242, 423-426 (1988), which are incorporated herein by reference). (See, generally, Hood et al., “Immunology”, Benjamin, N.Y., 2nd ed. (1984), and Hunkapiller and Hood, Nature, 323, 15-16 (1986)).

It is understood that the humanized antibodies designed and produced by the present method may have additional conservative amino acid substitutions which have substantially no effect on antigen binding or other antibody functions. By conservative substitutions is intended combinations such as gly, ala; val, ile, leu; asp, glu; asn, gln; ser, thr; lys, arg; and phe, tyr.
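The grouping of conservative substitutions listed above can be expressed as a simple lookup. The following Python sketch is illustrative only and is not part of the disclosed method; the function name `is_conservative` and the lowercase three-letter residue codes are assumptions made for this example, with asparagine and glutamine written as asn and gln.

```python
# Conservative substitution groups as listed in the text
# (three-letter amino acid codes, lowercase; asn/gln = asparagine/glutamine).
CONSERVATIVE_GROUPS = [
    {"gly", "ala"},
    {"val", "ile", "leu"},
    {"asp", "glu"},
    {"asn", "gln"},
    {"ser", "thr"},
    {"lys", "arg"},
    {"phe", "tyr"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two residues differ and share one of the listed groups.

    Hypothetical helper for illustration; residues outside the listed
    groups are simply treated as non-conservative.
    """
    a, b = original.lower(), replacement.lower()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

For instance, `is_conservative("Val", "Leu")` evaluates to true under these groups, while `is_conservative("gly", "asp")` does not.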

As used herein, the terms “determining,” “measuring,” “assessing,” and “assaying” are used interchangeably and include both quantitative and qualitative determinations.

The terms “polypeptide” and “protein”, used interchangeably herein, refer to a polymeric form of amino acids of any length, i.e. greater than 2 amino acids, greater than about 5 amino acids, greater than about 10 amino acids, greater than about 20 amino acids, greater than about 50 amino acids, greater than about 100 amino acids, greater than about 200 amino acids, greater than about 500 amino acids, greater than about 1000 amino acids, greater than about 2000 amino acids, usually not greater than about 10,000 amino acids, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones. The term includes fusion proteins, including, but not limited to, fusion proteins with a heterologous amino acid sequence, fusions with heterologous and homologous leader sequences, with or without N-terminal methionine residues; immunologically tagged proteins; fusion proteins with detectable fusion partners, e.g., fusion proteins including as a fusion partner a fluorescent protein, β-galactosidase, luciferase, etc.; and the like. Also included by these terms are polypeptides that are post-translationally modified in a cell, e.g., glycosylated, cleaved, secreted, prenylated, carboxylated, phosphorylated, etc., and polypeptides with secondary or tertiary structure, and polypeptides that are covalently or non-covalently bound to other moieties, e.g., other polypeptides, atoms, cofactors, etc.

As used herein the term “isolated,” when used in the context of an isolated antibody, refers to an antibody of interest that is at least 60% free, at least 75% free, at least 90% free, at least 95% free, at least 98% free, and even at least 99% free from other components with which the antibody is associated prior to purification.

A “coding sequence” or a sequence that “encodes” a selected polypeptide, is a nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the case of mRNA) into a polypeptide, for example, in vivo when placed under the control of appropriate regulatory sequences (or “control elements”). The boundaries of the coding sequence are typically determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxy) terminus. A coding sequence can include, but is not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic DNA sequences from viral or procaryotic DNA, and synthetic DNA sequences. A transcription termination sequence may be located 3′ to the coding sequence. Other “control elements” may also be associated with a coding sequence. A DNA sequence encoding a polypeptide can be optimized for expression in a selected cell by using the codons preferred by the selected cell to represent the DNA copy of the desired polypeptide coding sequence.
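The statement that a coding sequence is typically bounded by a start codon and a translation stop codon can be illustrated with a short Python sketch. It is a simplified illustration rather than a method taught herein: it assumes the standard genetic code (start codon ATG; stop codons TAA, TAG and TGA), scans only the given strand, and ignores introns and control elements. The function name `find_coding_sequence` is hypothetical.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str):
    """Return the first ATG...stop open reading frame (stop codon included).

    Simplified illustration: forward strand only, no introns, and the
    first ATG that is followed by an in-frame stop codon wins.
    Returns None when no such frame exists.
    """
    seq = dna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Step through codons in the frame set by this start codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        # No in-frame stop codon; try the next ATG downstream.
        start = seq.find("ATG", start + 1)
    return None
```

Applied to the short sequence `ccATGGCTTAAgg`, the function returns the frame running from the ATG through the in-frame TAA.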

“Encoded by” refers to a nucleic acid sequence which codes for a polypeptide sequence, wherein the polypeptide sequence or a portion thereof contains an amino acid sequence of at least 3 to 5 amino acids, more preferably at least 8 to 10 amino acids, and even more preferably at least 15 to 20 amino acids from a polypeptide encoded by the nucleic acid sequence. Also encompassed are polypeptide sequences that are immunologically identifiable with a polypeptide encoded by the sequence.

“Operably linked” refers to an arrangement of elements wherein the components so described are configured so as to perform their usual function. Thus, a given signal peptide that is operably linked to a polypeptide directs the secretion of the polypeptide from a cell. In the case of a promoter, a promoter that is operably linked to a coding sequence will direct the expression of a coding sequence. The promoter or other control elements need not be contiguous with the coding sequence, so long as they function to direct the expression thereof. For example, intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence.

By “nucleic acid construct” it is meant a nucleic acid sequence that has been constructed to comprise one or more functional units not found together in nature. Examples include circular, linear, double-stranded, extrachromosomal DNA molecules (plasmids), cosmids (plasmids containing COS sequences from lambda phage), viral genomes comprising non-native nucleic acid sequences, and the like.

A “vector” is capable of transferring gene sequences to target cells. Typically, “vector construct,” “expression vector,” and “gene transfer vector,” mean any nucleic acid construct capable of directing the expression of a gene of interest and which can transfer gene sequences to target cells, which can be accomplished by genomic integration of all or a portion of the vector, or transient or inheritable maintenance of the vector as an extrachromosomal element. Thus, the term includes cloning and expression vehicles, as well as integrating vectors.

An “expression cassette” comprises any nucleic acid construct capable of directing the expression of a gene/coding sequence of interest, which is operably linked to a promoter of the expression cassette. Such cassettes can be constructed into a “vector,” “vector construct,” “expression vector,” or “gene transfer vector,” in order to transfer the expression cassette into target cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.

A first polynucleotide is “derived from” a second polynucleotide if it has the same or substantially the same nucleotide sequence as a region of the second polynucleotide, its cDNA, complements thereof, or if it displays sequence identity as described above. A first polynucleotide may be derived from a second polynucleotide if the first polynucleotide is used as a template for, e.g. amplification of the second polynucleotide.

A first polypeptide is “derived from” a second polypeptide if it is (i) encoded by a first polynucleotide derived from a second polynucleotide, or (ii) displays sequence identity to the second polypeptide as described above.

The term “unit dosage form” as used herein refers to physically discrete units suitable as unitary dosages for subjects (e.g., animals, usually humans), each unit containing a predetermined quantity of an agent, e.g. an antibody in an amount sufficient to produce the desired effect in association with a pharmaceutically acceptable diluent, carrier or vehicle. The specifications for the novel unit dosage forms of the present invention will depend on a variety of factors including, but not necessarily limited to, the particular agent employed and the effect to be achieved, and the pharmacodynamics associated with each compound in the host.

A polynucleotide is “derived from” a particular cell if the polynucleotide was obtained from the cell. A polynucleotide may also be “derived from” a particular cell if the polynucleotide was obtained from the progeny of the cell, as long as the polynucleotide was present in the original cell. As such, a single cell may be isolated and cultured, e.g. in vitro, to form a cell culture. A polynucleotide isolated from the cell culture is “derived from” the single cell, as long as the polynucleotide was present in the isolated single cell.

A cell is “derived from” a host if the cell was obtained from the host. The progeny of a progenitor cell are derived from the same host as a progenitor cell. A cell may be “derived from” the same species as the host if the cell was isolated from an animal of the same species as the host animal. For example, NIH 3T3 cells are derived from mouse, 240E cells are derived from rabbit, and DT-40 cells are derived from chicken. The progeny of a progenitor cell are derived from the same species as the progenitor cell.

An antigen is “native” to a cell if the antigen is usually expressed by the cell. For example, rabbit 240E cells express rabbit polypeptide antigens. An antigen is “not-native” to a cell if the antigen is not usually expressed by the cell. For example, rabbit 240E cells do not usually express human polypeptide antigens, i.e., a polypeptide encoded by the human genome, and, as such, a human polypeptide is not native to a rabbit 240E cell.

The terms “treatment,” “treating” and the like are used herein to refer to any treatment of any disease or condition in a mammal, e.g. particularly a human or a mouse, and includes: a) preventing a disease, condition, or symptom of a disease or condition from occurring in a subject which may be predisposed to the disease but has not yet been diagnosed as having it; b) inhibiting a disease, condition, or symptom of a disease or condition, e.g., arresting its development and/or delaying its onset or manifestation in the patient; and/or c) relieving a disease, condition, or symptom of a disease or condition, e.g., causing regression of the condition or disease and/or its symptoms.

The terms “subject,” “host,” “patient,” and “individual” are used interchangeably herein to refer to any mammalian subject for whom diagnosis or therapy is desired, particularly humans. Other subjects may include cattle, dogs, cats, guinea pigs, rabbits, rats, mice, horses, and so on.

Before the present subject invention is described further, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range is also specifically disclosed. Each smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included or excluded in the range, and each range where either, neither or both limits are included in the smaller ranges is also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “an antibody” includes a plurality of such antibodies and reference to “a variable domain” includes reference to one or more variable domains and equivalents thereof known to those skilled in the art, and so forth.

The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The invention provides a method of identifying an antigen that is present in different amounts in two different samples. In general, the methods involve generating a first and second distinguishably labeled population of antibodies that are reactive to the two samples, contacting the first and second labeled populations of antibodies with a plurality of antigens, and identifying any resultant antigens that are differentially bound by the first and second populations of antibodies. The antigens may be on the surface of cells, e.g., animal cells, or on a solid support. Once identified, the nucleic acid encoding an antigen of interest may be identified and sequenced to reveal the identity of the antigen of interest. Kits for performing the methods are also provided. The methods find most use in medical and research applications, in particular, for identifying cell surface targets for immunotherapy and drug discovery.
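By way of illustration only, the step of identifying antigens that are differentially bound by two distinguishably labeled antibody populations may be thought of as comparing the two label signals measured for each antigen. The Python sketch below is a hypothetical example and not a description of how the methods must be performed: the function name `classify_antigens`, the log2-ratio criterion, the default two-fold threshold and the pseudocount are all assumptions chosen for this example.

```python
import math


def classify_antigens(signals, min_log2_ratio=1.0, pseudocount=1.0):
    """Label each antigen by the ratio of its two antibody-label signals.

    signals: dict mapping an antigen (or clone) identifier to a pair
        (signal_from_first_label, signal_from_second_label).
    min_log2_ratio: hypothetical cutoff; 1.0 corresponds to a two-fold
        difference between the two signals.
    pseudocount: small constant (assumed) that avoids division by zero.
    """
    calls = {}
    for antigen, (first, second) in signals.items():
        log_ratio = math.log2((first + pseudocount) / (second + pseudocount))
        if log_ratio >= min_log2_ratio:
            calls[antigen] = "bound more by first population"
        elif log_ratio <= -min_log2_ratio:
            calls[antigen] = "bound more by second population"
        else:
            calls[antigen] = "not differentially bound"
    return calls
```

With made-up signal pairs such as `{"clone_a": (799, 99), "clone_b": (100, 100)}`, the first entry would be called as bound more by the first population and the second as not differentially bound. In practice, any threshold would need to be set empirically for the labels and detection method used.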

In describing the invention, the compositions for use in the subject methods are described first, followed by a description of the methods themselves. Finally, kits for performing the subject methods are described, as well as several applications in which the subject methods find use.

Samples

The invention provides a method of identifying an antigen that is present in two samples, i.e., a first and a second sample, in differing amounts. Samples that are complex are usually used in the subject method, and accordingly, biological samples are usually used. By complex is meant that the samples contain a mixture of 10⁵, 10⁶, 10⁷, 10⁸ or 10⁹ or more different antigens, where an antigen is a target for an antibody. Accordingly, the samples used in the subject methods usually consist of a mixture of at least 10³, 10⁴, 10⁵, etc., or more different constituents, e.g., proteins and the like. A suitable sample may be any sample that can act as an immunogen in an animal, e.g., a rabbit, mouse, chicken, etc., to produce polyclonal antisera with an association constant of at least 10⁵ M⁻¹ against the sample. In many embodiments, therefore, a sample is a cell, usually a pathogen or a mammalian cell or fraction thereof, and antigens of particular interest include those at the surface of the cell (i.e., antigens that are exposed to the outside of a cell and accessible to an antibody when the cell is intact). By “pathogen or mammalian cell” is meant a cell from a pathogen, e.g. a bacterium or yeast, or a mammal, e.g., a human cell, or derivative thereof, including those cells grown in vitro, recombinant cells, cell fusions, and cells infected with a pathogen, and the like. By “fraction of a cell” is meant a subset of the total constituents of a cell that has been separated from other constituents of the cell using any means, e.g., physical or biochemical means. Fractions of a cell that are of interest include fractions containing membrane proteins, soluble proteins, plasma membrane proteins, nuclear extract, nuclear membrane proteins, and the like, and biochemical fractions such as those obtained by separating the constituents of a cell by chromatography, e.g., affinity chromatography. In most embodiments, samples contain proteins in their native conformation, i.e., the samples have not been treated in any way to denature or solubilize the proteins contained in the sample.

Accordingly, the subject methods may be used to investigate: the subcellular localization of proteins, secretion of proteins (e.g., hormones, cytokines, growth factors, extracellular matrix proteins, etc.), cell surface proteins (including receptors, channels, transporters and adhesion molecules, etc.), and intracellular proteins (including enzymes, metabolic machinery proteins, signaling proteins and structural proteins). As discussed above, physical separation of these different classes of proteins by biochemical methods is well developed in the art. In addition, methods have been developed to isolate specific groups of cells from a given tissue. It is possible to purify a specific class of proteins from a specific cell type before using these protein materials in the process of this invention.

As noted above, two different samples are usually used in the subject methods. In most embodiments, the two samples are a pair of samples consisting of an “experimental” sample, i.e., a sample of interest, and a “control” sample to which the experimental sample may be compared. In many embodiments, therefore, the subject samples are pairs of cell types or fractions thereof, one cell type being a cell type of interest, e.g., abnormal cells, and the other a control, e.g., normal, cell type. If two fractions of cells are compared, the fractions are usually the same fraction from each of the two cells. In certain embodiments, however, two fractions of the same cell may be compared. Exemplary cell type pairs include, for example, cells isolated from a tissue biopsy (e.g., from a tissue having a disease such as colon, breast, prostate, lung, skin cancer, or infected with a pathogen etc.) and normal cells from the same tissue, usually from the same patient; cells grown in tissue culture that are immortal (e.g., cells with a proliferative mutation or an immortalizing transgene), infected with a pathogen, or treated (e.g., with environmental or chemical agents such as peptides, hormones, altered temperature, growth condition, physical stress, cellular transformation, etc.), and a normal cell (e.g., a cell that is otherwise identical to the experimental cell except that it is not immortal, infected, or treated, etc.); a cell isolated from a mammal with a cancer, a disease, a geriatric mammal, or a mammal exposed to a condition, and a cell from a mammal of the same species, preferably from the same family, that is healthy or young; and differentiated cells and non-differentiated cells from the same mammal (e.g., one cell being the progenitor of the other in a mammal, for example). In one embodiment, cells of different type, e.g., neuronal and non-neuronal cells, or cells of different status (e.g. before and after a stimulus on the cells) may be used. In another embodiment of the invention, the experimental material is cells susceptible to infection by a pathogen such as a virus, e.g. human immunodeficiency virus (HIV), etc., and the control material is cells resistant to infection by the pathogen. In another embodiment of the invention, the sample pair is represented by undifferentiated cells, e.g., stem cells, and differentiated cells.

Accordingly, the methods usually involve administering two samples, an experimental sample and a control sample, each containing a plurality of different isolated antigens to different animals of the same species (e.g., two rabbits) such that antibodies that bind at least a subset of the plurality of different antigens of each sample are made by the animal (i.e. greater than 10%, greater than about 20%, greater than about 40%, greater than about 60%, greater than about 70%, greater than about 80% or even greater than about 90% or even greater than about 95%, up to 100% of the antigens). In most embodiments, the antigens are biopolymers such as different polypeptides, oligopeptides, proteins, protein fragments, nucleic acids, polysaccharides, carbohydrates, lipids or oils, other molecules such as small inorganic or organic molecules, or mixtures or modified variants thereof, particularly those found on the surface of a cell. In many embodiments in which cellular samples are used, the cells are usually derived from a different species to the animal to be immunized by the sample. For example, if human samples are of interest, then rabbits, chickens or mice, etc., may be immunized.

Procaryotic (e.g. bacterial) and eucaryotic (e.g. insect, vertebrate, mammalian, human) cells or tissues may be used as samples. Human cells and tissues include, but are not limited to, epithelial cells, endothelial cells, fibroblasts, nervous tissue cells, immune cells, muscle cells, hepatocytes, immortalized cells, malignant cells, virus-infected cells and cells in a particular physiological and pathological state, etc.

In most embodiments, an antigen is a polypeptide, and the polypeptide is from about 9 to about 15 amino acids in length, about 16 to about 40 amino acids in length, about 41 to about 60 amino acids in length, about 61 to about 100 amino acids in length, about 101 to about 200 amino acids in length, about 201 to about 300 amino acids in length, about 301 to about 400 amino acids in length, about 401 to about 500 amino acids in length, usually less than about 1000 amino acids in length.

Antibodies

In general, the methods involve immunizing a pair of identical animals (i.e., genetically identical or inbred animals), for example a pair of suitable mammals such as mice, rabbits or guinea pigs, or suitable avians, such as chickens, with a sample pair, as described above, such that the animals produce antibodies against the samples. If the samples are cells, they may be termed “cellular immunogens”. A cellular immunogen comprises more than about 10⁴, more than about 10⁵, more than about 10⁶, more than about 10⁷, more than about 10⁸, and usually no more than about 10⁹ or 10¹⁰ cells.

In most embodiments, the cells of a cellular immunogen are usually derived from a species different to the animal being immunized (e.g., if the cells are human cells, the cells may be used to immunize rabbits), and, accordingly, the animals usually mount an immune response against the cells. In certain embodiments, the cells are immortalized cells, and cells may be administered as living cells.

Methods of immunizing animals, including the adjuvants used, booster schedules, sites of injection, suitable animals, etc. are well understood in the art, e.g., Harlow et al. (Antibodies: A Laboratory Manual, First Edition (1988) Cold Spring Harbor, N.Y., and Harlow, supra), and administration of living cells to animals has been described for several mammals and birds, e.g., McKenzie et al. (Oncogene 4:543-8, 1989), Scuderi et al. (Med. Oncol. Tumor Pharmacother 2:233-42, 1985), Roth et al. (Surgery 96:264-72, 1984) and Drebin et al. (Nature 312:545-8, 1984).

In many embodiments more than about 10⁴, more than about 10⁵, more than about 10⁶, more than about 10⁷, more than about 10⁸, and usually no more than about 10⁹ or 10¹⁰ antigens (or cells) are administered to the animal to provide for antibodies against the antigens.

After the animals have been immunized, the animals usually mount an immune response against the plurality of antigens, and the blood of such animals will normally contain polyclonal antisera that bind (as measured, e.g., by ELISA, western blot, etc.), depending on how the methods are performed, at least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 80%, and usually not more than about 90% or 95% or 100% of the plurality of antigens. Binding affinities of the polyclonal antisera to the antigens may vary between antigens, but will generally be in the range of at least about 10⁶ M⁻¹, at least about 10⁷ M⁻¹, at least about 10⁸ M⁻¹, or at least about 10⁹ M⁻¹ to 10¹⁰ M⁻¹ for the cells. Polyclonal antisera are usually harvested from the immunized animals using methods well known in the art and used in the subject methods.

Accordingly, two antisera are usually produced, one for each of the two samples.

Libraries

A variety of libraries may be used in the subject methods, including cell and bacteriophage cDNA expression libraries. Such libraries are well known in the art (see Ausubel and Sambrook, supra) and need not be discussed here in any more detail. In most embodiments, the libraries are made from the experimental sample, e.g., the library may be a cDNA library made from RNA isolated from the cells of the experimental sample, or a sample similar thereto.

In certain embodiments, the subject libraries may be provided by introducing cDNA vectors, e.g., a plurality of retroviral vectors containing individual cDNAs, or the like, into suitable animal cells. For example, a retroviral cDNA library may be made from cells of an experimental sample and introduced into host cells of a different species from the experimental cells in order to provide for expression of the experimental sample proteins. In many embodiments, the cDNA libraries are introduced into host cells that are from the same species as the animal that was immunized. For example, if the experimental sample cells are human cells, the host cells into which the cDNAs made from the experimental sample cells are introduced may be non-human animal cells, e.g., mouse, rabbit, hamster, goat or avian (e.g., chicken) cells, depending on the animal used for immunizations. For example, if a mouse, chicken or rabbit is immunized, the host cell for a library may be a mouse cell (e.g., NIH 3T3), a chicken cell (e.g., DT-40) or a rabbit cell (e.g., 240E), etc. Animal cells usually correctly post-transcriptionally modify, target and fold other animal proteins, and, as such, are most usually used in the subject methods.

Accordingly, in many embodiments, the animal cell libraries produce antigens that are not native to the host cells used, i.e., are not usually expressed in the host cells, and are usually derived from a different species than the host cells. For example, if the host cells are rabbit host cells, the antigens may be human or mouse polypeptides, and if the host cells are mouse host cells, the antigens may be human or rabbit polypeptides.

In most embodiments, therefore, a plurality of different antigens is expressed in a population of host cells through introduction of a plurality of different nucleic acids encoding the antigens. In most embodiments, the nucleic acids are comprised within expression cassettes.

A library of different nucleic acids is usually transferred into a population of host cells such that the population of host cells produces the plurality of antigens. In some embodiments, the proteins are produced within the cell and the cell secretes and/or surface targets certain antigens. In certain embodiments, a single host cell may receive more than one antigen-encoding nucleic acid and may produce more than one antigen, whereas other cells may receive one antigen-encoding nucleic acid and produce one antigen. The population of cells, in general, is usually a mixture of single antigen and multiple antigen expressing cells, although populations in which single cells express single antigens and populations in which single cells express multiple antigens are also envisioned.

In one embodiment, the antigen encoding nucleic acids are nucleic acids identified as having a particular property, e.g., they may be differentially expressed in a disease such as a cancer, for example colon or breast cancer, or expressed in a certain tissue, or expressed at a certain time during normal or abnormal development, or encode polypeptides with certain activities, e.g. secreted polypeptides, membrane bound polypeptides, cell surface polypeptides, or have a certain other activity. Such nucleic acids may be identified using conventional gene expression (e.g., microarray), subtractive hybridization (see, e.g., J Cancer Res Clin Oncol. 1997; 123:447-51 and Gene 2001 262:207-14), or conventional library screening technologies. Antigen encoding nucleic acids may also be unselected, meaning that they are not selected because of their expression pattern or activity of the antigen.

In certain embodiments, a plurality of expression cassettes containing different antigen-encoding nucleic acids is provided by a cDNA library, where the cDNA is cloned into a vector suitable for the expression of the cDNA. Such vectors typically provide a promoter suitable for use in host cells operably linked to the cDNA, and as such, a plurality of such vectors will provide a plurality of expression cassettes for expression of different antigens in host cells. Exemplary vectors suitable for library construction include pCI from Promega (Madison, Wis.), Retro-X system from Clontech (Palo Alto, Calif.) and pCDNA3.1 from Invitrogen (Carlsbad, Calif.), and cDNA libraries in suitable vectors (e.g., retroviral expression libraries or plasmid expression libraries) can be purchased from Clontech (Palo Alto, Calif.) and Stratagene (La Jolla, Calif.) or from EdgeBiosystems (Gaithersburg, Md.). In certain embodiments, the vectors may contain secretion signal or cell surface targeting sequences, as described in further detail below.

By plurality is meant more than 1, for example more than 2, more than about 5, more than about 10, more than about 20, more than about 50, more than about 100, more than about 200, more than about 500, more than about 1000, more than about 2000, more than about 5000, more than about 10,000, more than about 20,000, more than about 50,000, more than about 100,000, usually no more than about 200,000. A “population” contains a plurality of items.

In many embodiments, the methods involve modifying host cells to make modified host cells that produce at least one antigen, e.g., at least one polypeptide, protein, fragment of a polypeptide or protein, post-translationally modified polypeptide etc. In certain embodiments, antigens are secreted from the modified host cells and/or targeted to the surface of the modified host cells such that antigenic epitopes of the polypeptide are presented on the outside of the cells. For example, a transmembrane receptor kinase may be targeted to the surface of an individual cell, and may present antigenic epitopes on the ligand binding domain on the outside of the cell.

In most embodiments, the modified host cells produce antigens which, in certain embodiments, may be post-transcriptionally and/or post-translationally modified in the modified host cells such that the polypeptides are cleaved, rearranged, or covalently modified, e.g., glycosylated, phosphorylated, prenylated, acetylated, amidated, carbamylated, deamidated, farnesylated, formylated, geranyl-geranylated, methylated, or myristoylated, etc., as they would be in a test cell.

In most embodiments, a single antigen of the plurality of antigens is encoded by a nucleic acid encoding at least the primary structure of the antigen, and the antigen-encoding nucleic acid is introduced into a host cell to provide for production of the antigen. In certain embodiments the nucleic acid encoding the antigen is a cDNA, sometimes a full-length cDNA, which means that the cDNA encodes a full length polypeptide. The sequence of the nucleic acid may be determined, may be partially determined, or may be unknown, and the nucleic acid may be selected, e.g., based on its expression under certain conditions, or may be unselected. In most embodiments, the nucleic acid is derived from a species different from that from which the host cell is derived, e.g., humans, if the host cell is a non-human animal cell. As such, the modified host cell is usually a recombinant host cell, and the antigen is encoded by a nucleic acid. In certain embodiments, the antigen is a post-translationally modified antigen and may have associated secondary, tertiary or quaternary structures.

A single antigen of the plurality is usually expressed in the modified host cell using an expression cassette containing a nucleic acid sequence encoding the antigen. Expression cassettes, including suitable promoters (e.g., inducible promoters), terminators, enhancers, translation initiation signals and translational enhancers, are well known in the art, and are discussed in Ausubel, et al, (Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995) and Sambrook, et al, (Molecular Cloning: A Laboratory Manual, Third Edition, (2001) Cold Spring Harbor, N.Y.). Suitable promoters include SV40 elements, as described in Dijkema et al., EMBO J. (1985) 4:761; transcription regulatory elements derived from the LTR of the Rous sarcoma virus, as described in Gorman et al., Proc. Nat'l Acad. Sci USA (1982) 79:6777; transcription regulatory elements derived from the LTR of human cytomegalovirus (CMV), as described in Boshart et al., Cell (1985) 41:521; hsp70 promoters (Levy-Holtzman, R. and I. Schechter, Biochim. Biophys. Acta (1995) 1263: 96-98; Presnail, J. K. and M. A. Hoy, Exp. Appl. Acarol. (1994) 18: 301-308) and the like. The expression polynucleotide provides expression cassettes for expression of an antigen in a host cell. In most embodiments, each expression cassette is more than about 0.5 kb in length, more than about 1.0 kb in length, more than about 1.5 kb in length, more than about 2 kb in length, more than about 4 kb in length, more than about 5 kb in length, and is usually less than 10 kb in length.

The expression cassette may be linear, or encompassed in a circular vector, which may further comprise a selectable marker. Suitable vectors, e.g., viral and plasmid vectors, and selectable markers are well known in the art and discussed in Ausubel, et al, (Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995) and Sambrook, et al, (Molecular Cloning: A Laboratory Manual, Third Edition, (2001) Cold Spring Harbor, N.Y.). A variety of different genes have been employed as selectable markers, and the particular gene employed in the subject vectors as a selectable marker is chosen primarily as a matter of convenience. Known selectable marker genes include: the thymidine kinase gene, the dihydrofolate reductase gene, the xanthine-guanine phosphoribosyl transferase gene, CAD, the adenosine deaminase gene, the asparagine synthetase gene, the antibiotic resistance genes, e.g., tetr, ampr, Cmr or cat, kanr or neor (aminoglycoside phosphotransferase genes), the hygromycin B phosphotransferase gene, and the like. Vectors may provide for integration into the host cell genome, or may be autonomous from the host cell genome.

In certain embodiments, an expression cassette further provides for targeting of the antigen to the surface of the host cell by producing an antigen operably linked to a cell surface targeting polypeptide. In such embodiments, the antigen encoding nucleic acid may be operably linked to a cell surface targeting polypeptide-encoding nucleic acid in the expression cassette, and transcription and subsequent translation of the nucleic acids provides for production of a fusion protein containing the antigen and the cell surface targeting polypeptide. As such, the expression cassette can provide for targeting of an antigen to the surface of a host cell, which antigen is not usually presented on the surface of the host cell. Suitable cell surface targeting polypeptides and their encoding nucleic acid sequences may be those of, for example, transmembrane serine threonine or tyrosine kinase receptors. Suitable cell surface targeting signals and their encoding nucleic acid sequences include receptor transmembrane domains, such as the epidermal growth factor receptor (EGFR) transmembrane domain (Ullrich, A. et al. Nature 309: 418-425 (1984)). In many embodiments, the cell surface targeting sequence is derived from the same species as the host cell. Further examples of strategies for targeting of polypeptides in a cell or protein secretion may be found in U.S. Pat. No. 6,455,247.

Expression cassettes may be introduced into a host cell using a variety of methods, including viral infection, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, viral vector delivery, and the like. The choice of method is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (i.e., in vitro). A general discussion of these methods can be found in Ausubel, et al, Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.

Accordingly, libraries may be produced using a number of means.

Methods

The invention thus provides a method for producing two polyclonal antisera, one directed to an “experimental” sample, usually containing at least one antigen, e.g., a protein, of interest, and one directed to a “control” sample that usually does not contain any antigens of interest. Accordingly, the invention provides two polyclonal antisera, one containing antibodies that specifically bind to antigens of interest, and another that does not. Identifying the antigens of interest is done by screening a library of antigens, e.g., one made from the “experimental” sample, as discussed above, and identifying antigens that bind to antisera made using the experimental sample in a higher amount than antisera made using the control sample. In particular embodiments, therefore, the methods involve differentially labeling the two polyclonal antibody populations, contacting the populations of labeled antibodies with a library of antigens under conditions suitable for the binding of the antibodies to the antigen, and identifying any antigens that are differentially bound by one population of antibodies as compared to the other. Antigens of interest may be identified because they are differentially bound by the polyclonal antisera, and this may be observed by assessing the relative levels of detectable labels bound to the antigen.

The subject populations of polyclonal antibodies may be differentially labeled using methods well known to the antibody arts (see e.g., Harlow and Lane (Using Antibodies: A Laboratory Manual, CSHL Press, 1999)). In particular, fluorescent labels that find use in the subject invention include xanthene dyes, e.g. fluorescein and rhodamine dyes, such as fluorescein isothiocyanate (FITC), 6-carboxyfluorescein (commonly known by the abbreviations FAM and F), 6-carboxy-2′,4′,7′,4,7-hexachlorofluorescein (HEX), 6-carboxy-4′,5′-dichloro-2′,7′-dimethoxyfluorescein (JOE or J), N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA or T), 6-carboxy-X-rhodamine (ROX or R), 5-carboxyrhodamine-6G (R6G⁵ or G⁵), 6-carboxyrhodamine-6G (R6G⁶ or G⁶), and rhodamine 110; cyanine dyes, e.g. Cy3, Cy5 and Cy7 dyes; coumarins, e.g. umbelliferone; benzimide dyes, e.g. Hoechst 33258; phenanthridine dyes, e.g. Texas Red; ethidium dyes; acridine dyes; carbazole dyes; phenoxazine dyes; porphyrin dyes; polymethine dyes, e.g. cyanine dyes such as Cy3, Cy5, etc; BODIPY dyes and quinoline dyes. Specific fluorophores of interest that are commonly used in subject applications include: Pyrene, Coumarin, Diethylaminocoumarin, FAM, Fluorescein Chlorotriazinyl, Fluorescein, R110, Eosin, JOE, R6G, Tetramethylrhodamine, TAMRA, Lissamine, ROX, Naphthofluorescein, Texas Red, Cy3, and Cy5, etc. Antibodies may also be bound to chromogenic products.

In one embodiment of the invention, fluorescein (the active form is FITC) is used to label one antiserum, while phycoerythrin (PE) is used to label the other antiserum. In another embodiment, green fluorescent protein (GFP) is used to label one antiserum, while a different fluorescent protein is used to label the other. In another embodiment of the invention, horseradish peroxidase (HRP) is used to label one antiserum, while alkaline phosphatase (AP) is used to label the other.

As mentioned above, the labels used in the subject methods are distinguishable, meaning that the labels can be independently detected and measured, even when the labels are mixed. In other words, the amounts of label present (e.g., the amount of fluorescence) for each of the labels are separately determinable, even when the labels are co-located (e.g., in the same tube or in the same cell, plaque etc.). Suitable distinguishable fluorescent label pairs useful in the subject methods include Cy-3 and Cy-5 (Amersham Inc., Piscataway, N.J.), Quasar 570 and Quasar 670 (Biosearch Technology, Novato, Calif.), Alexa Fluor 555 and Alexa Fluor 647 (Molecular Probes, Eugene, Oreg.), BODIPY V-1002 and BODIPY V1005 (Molecular Probes, Eugene, Oreg.), POPO-3 and TOTO-3 (Molecular Probes, Eugene, Oreg.), and POPRO3 and TOPRO3 (Molecular Probes, Eugene, Oreg.). Further suitable distinguishable detectable labels may be found in Kricka et al. (Ann Clin Biochem. 39:114-29, 2002).

Once labeled, the two populations of labeled antibodies are used to screen a library of antigens to identify an antigen that is produced at a higher level in an experimental sample, as compared to a control sample. Since libraries of antigens may be obtained in a variety of formats, so, too, may the screening methods. The conditions for antibody binding are generally those used for “Western blotting”, library screening of bacteriophage expression libraries, and staining of cells with antibodies, as are well known in the art (See Harlow, Sambrook and Ausubel, supra).

Antigens of interest are usually identified because they are differentially bound by the labeled antibodies, and binding is assessed by measuring the level of antibody label associated with the antigen. In many embodiments, an antigen of interest is an antigen that is differentially bound by the antibodies such that the antigen is bound by experimental sample antibodies to provide a signal that is at least 2× greater, at least 5× greater, at least 10× greater, at least 20× greater, at least 50× greater or at least 100× greater, or more, as compared to the control sample antibodies, after any normalization of the signals has been performed.
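By way of illustration only, the fold-difference criterion described above may be sketched as follows. The function names, the choice of dividing each raw signal by a per-label reference value as the normalization step, and the default threshold are assumptions made for this sketch and are not prescribed by the subject methods.

```python
# Hypothetical sketch: decide whether an antigen is "of interest" from the
# normalized signals of the two distinguishable labels. All names and the
# normalization scheme are illustrative assumptions.

def normalize(signal: float, reference: float) -> float:
    """Scale a raw label signal by a per-label reference value
    (for example, the total signal measured for that label)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return signal / reference


def fold_difference(experimental: float, control: float, floor: float = 1e-9) -> float:
    """Ratio of the normalized experimental-antibody signal to the normalized
    control-antibody signal. `floor` prevents division by zero."""
    return experimental / max(control, floor)


def is_antigen_of_interest(experimental: float, control: float, min_fold: float = 2.0) -> bool:
    """True when the experimental signal is at least `min_fold` times the
    control signal (2x, 5x, 10x, etc., depending on the embodiment)."""
    return fold_difference(experimental, control) >= min_fold
```

A stricter embodiment would simply pass a larger `min_fold`, e.g. `is_antigen_of_interest(exp, ctl, min_fold=10.0)`.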

The two labeled antisera are usually mixed in a suitable ratio, including an equimolar ratio, before binding to the library. In one embodiment of the invention, an excess amount of the antiserum for the “control” sample is mixed with the antiserum for the “experimental” sample so that only dominant species that are present in the antiserum for the experimental sample but absent in the antiserum for the control sample can bind to the displayed proteins. This stringent condition favors the selection of antigens of interest. In another embodiment of the invention, equimolar amounts of the antisera are mixed and allowed to bind to the displayed proteins or peptides. Antigens that are overexpressed or underexpressed in either of the samples may be assessed.

For example, traditional plaque or colony-based methods using cDNA expression libraries cloned into suitable vectors, e.g., bacteriophage lambda vectors, may be used, where a library of polypeptides may be induced in a cell using IPTG, the cell lysed and the lysate linked to a solid support, e.g., a nylon membrane or the like, and the nylon membrane contacted with the antibodies under conditions suitable for antibody binding. Differentially labeled areas of the solid supports, corresponding to individual phage clones or colonies, may be identified.

In other embodiments, however, the library may be represented by an array of proteins. In this embodiment, an array containing a plurality of polypeptides is contacted with the pair of antibody populations, and polypeptides on the array that are differentially bound by the antibodies are identified using methods that are typically used in the DNA array arts. Methods for making and using microarrays of polypeptides are known in the art (see e.g., U.S. Pat. Nos. 6,372,483, 6,352,842, 6,346,416 and 6,242,266).

In many embodiments of particular interest, a cellular library, i.e., animal cells expressing a cDNA library derived from the experimental sample, as described above, may be used. In most embodiments, cells from the animal used for immunizations are used to host and express the library. Accordingly, animal cells producing antigens of the experimental sample cells may be made, and screened. Such cellular libraries are usually contacted with the two antibody populations under conditions suitable for antibody binding, and cells that are differentially bound by the antibody populations may be separated from other cells by, for example, flow cytometry, e.g. FACS. Methods for performing flow cytometry are generally well known in the art. In general, these methods involve passing a plurality of cells singly through a detector, e.g., a fluorescence detector, and cells with desirable fluorescence are separated from other cells.
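As a minimal sketch of how cells with desirable fluorescence might be selected in software, the following gate keeps events whose experimental-label signal is above a background cutoff and exceeds the control-label signal by a chosen ratio. The `Event` record, the threshold values and the clamping of the control signal are hypothetical choices for illustration; an actual sorter applies gates configured by the operator.

```python
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """One cell passing the detector (hypothetical record layout)."""
    cell_id: int
    exp_signal: float  # fluorescence from the experimental-sample antibody label
    ctl_signal: float  # fluorescence from the control-sample antibody label


def gate_events(events: Iterable[Event],
                min_exp_signal: float = 100.0,
                min_ratio: float = 2.0) -> List[int]:
    """Return the ids of cells that are differentially bound.

    A cell is kept when its experimental-label signal is above background
    (`min_exp_signal`) and is at least `min_ratio` times its control-label
    signal. The control signal is clamped to 1.0 to avoid division by zero.
    """
    selected = []
    for ev in events:
        ratio = ev.exp_signal / max(ev.ctl_signal, 1.0)
        if ev.exp_signal >= min_exp_signal and ratio >= min_ratio:
            selected.append(ev.cell_id)
    return selected
```

Requiring both a minimum signal and a minimum ratio is a design choice of this sketch: it avoids selecting dim cells whose ratio is high only because both signals are near background.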

In one embodiment, host cells containing an expression library are cultured under conditions suitable for expression of the experimental sample cDNA. The cells are then fixed and permeabilized so that the proteins expressed within the cells are bindable by the antisera. The expression host cells are bound by the antisera. The host cells can be isolated by fluorescence-activated cell sorting or by an affinity chromatography method, such as by using protein A-coated beads. The differentially stained cells are identified and isolated.

In another embodiment, host cells containing an expression library are fixed on a solid surface such as a tissue culture dish, with or without permeabilization, before being allowed to be bound by the complex probes. This process is well known in the art and is called immunostaining or immunocytochemistry. The differentially stained cells are identified and isolated.

The cDNA molecules that encode the displayed proteins can be isolated from the identified cells, cloned and sequenced. Accordingly, an antigen of interest may be identified using the subject methods.

As is well known in the art, once cells producing an antigen of interest are identified, the nucleic acid encoding the antigen of interest may be recovered from the cell using well known methods, e.g., plasmid rescue in bacteria, PCR, plasmid excision, etc., and sequenced and otherwise studied. Once the sequence of the nucleic acid encoding the antigen of interest becomes known, so too does the amino acid sequence of the antigen of interest.
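The step from a recovered nucleic acid sequence to an amino acid sequence can be illustrated with a short sketch using the standard genetic code. The function name and the simplifying assumptions (translation begins at the first ATG in the given strand, stops at the first in-frame stop codon, and unrecognized codons are reported as "X") are choices made here for illustration; real cDNA inserts may require inspection of other reading frames.

```python
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G for each
# codon position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate the first open reading frame of a cDNA (coding strand).

    Assumptions of this sketch: translation starts at the first ATG and
    stops at the first in-frame stop codon or at the end of the sequence;
    codons containing unrecognized bases are reported as "X".
    """
    seq = cdna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate("ccATGGCCAAATGAgg")` reads the codons ATG, GCC, AAA and TGA and returns `"MAK"`.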

Once the identity of the antigen of interest is known, monoclonal antibodies that specifically bind to the antigen of interest may be made by traditional methods (see Harlow, supra). In one embodiment, if a host cell from an animal is identified that produces the antigen of interest, that host cell may be cultured and used to inoculate a suitable animal, the suitable animal being of the same species as the host cell. Accordingly, polyclonal and monoclonal antibodies may be made for the antigen of interest.

With specific reference to FIG. 1, two antigen binding agent populations, i.e., complex probes, are produced, one for “tester” cells or proteins, and the other for “driver” cells or proteins. The two agent populations are labeled with labels 1 or 2, the labeled populations are mixed and contacted with a library of antigens, e.g., proteins presented on the surface of cells expressing a cDNA library, or an array of antigens on a solid support, and proteins that are differentially produced are identified because they are differentially bound by the binding agent populations.

With specific reference to FIG. 2, a cDNA expression library is made from a suitable human cancer cell line (e.g., HeLa cells) and used to infect suitable non-human animal cells, e.g., rabbit 240E cells. A human cancer cell line (e.g., HeLa cells) and a suitable non-cancerous control (e.g., normal fibroblast cells) are used to immunize suitable non-human animals, e.g., rabbits, to produce polyclonal antisera for the two cell types. The two antisera are differentially labeled and used to “competitively stain” the infected non-human animal cells. The stained cells are sorted by FACS according to their profile, deposited into culture media and cultured. The cells may be directly used to immunize rabbits to make rabbit monoclonal antibodies. The cDNA contained in the sorted cells may be sequenced to identify tumor associated surface antigens.

It is recognized that the present invention provides for a method wherein the experimental and control samples are reversed to identify antigens which are either increasing or decreasing between two samples. In certain embodiments, the antibodies may be “phage-display” antibodies that are well known in the art, or other specific binding moieties, such as aptamers, etc.

Kits

Also provided by the subject invention are kits for practicing the subject methods, as described above. The subject kits at least include one or more of: an experimental and control sample, that, in particular embodiments are cellular samples, a cDNA library made from the experimental sample that may be present in animal cells, two antibody populations reactive against experimental and control samples, labeling reagents for labeling the antibodies, etc. Other optional components of the kit include: components for performing antibody binding assays, e.g., buffers, etc. The various components of the kit may be present in separate containers or certain compatible components may be precombined into a single container, as desired.

In addition to the above-mentioned components, the subject kits typically further include instructions for using the components of the kit to practice the subject methods. The instructions for practicing the subject methods are generally recorded on a suitable recording medium. For example, the instructions may be printed on a substrate, such as paper or plastic, etc. As such, the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or subpackaging) etc. In other embodiments, the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g. CD-ROM, diskette, etc. In yet other embodiments, the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided. An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.

Also provided by the subject invention are kits including at least a computer readable medium including programming as discussed above and instructions. The instructions may include installation or setup directions. The instructions may include directions for use of the invention with options or combinations of options as described above. In certain embodiments, the instructions include both types of information.

Providing the software and instructions as a kit may serve a number of purposes. The combination may be packaged and purchased as a means for identifying antigens that are present at different amounts between two samples, including present and absent.

Utility

The invention provides a method for identifying an antigen, e.g., obtaining an amino acid sequence of an antigen, that is present in greater amounts in one sample as compared to another. Accordingly, these methods have several applications, a representative number of which will be described below.

In one embodiment, the identified antigen may be a protein that is only present in abnormal, e.g., cancerous or pathogen-infected cells, as compared to normal, e.g., non-cancerous or non-pathogen-infected cells. Accordingly, such a protein may be used as a target for drug, e.g., antibody or small molecule, therapy. Drug screening assays, which are generally well known in the art, may be used to identify such drugs. In particular embodiments, since the identified antigen may be an antigen on the surface of a cell, the drugs, in particular monoclonal antibodies that specifically bind to the antigen, may be made and screened for cytotoxic or other inhibitory activity against cells producing the identified antigen on the surface.

In another embodiment, the subject methods may be used in research, to understand the molecular events that are associated with an alteration of a cell (e.g., upon contacting the cell with a chemical or environmental stimulant, pathogen, or a change of a cell during its development, etc.).

In other embodiments, the subject methods may simply be used to investigate the differences between two cell types, e.g., cells from two different tissues, or the like.

EXAMPLES

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Centigrade, and pressure is at or near atmospheric.

Example 1 DISC Technology

Dual Immunostaining-mediated Subtractive Cloning (DISC) technology may be used to identify disease-specific cell surface antigens, e.g. tumor-associated antigens (TAAs). Rabbit 240E cells are capable of expressing the entire range of human proteins, including normal and tumor-specific proteins. After infecting 240E cells with a HeLa cell cDNA library, some of the cells will express HeLa TAAs. A cell line expressing a particular TAA should preferentially bind a subset of the “anti-HeLa” polyclonal antibodies. In contrast, this same cell line should not bind the anti-NHDF (normal human dermal fibroblast) antibodies. By differentially labeling the two polyclonal antibodies and binding them competitively to the transformed 240E cell lines (i.e., showing competitive staining), we expect to highlight cells expressing TAAs and to sort them using FACS. After FACS analysis, the isolated TAA cell lines will be cultured and used in two ways. First, tumor antigen DNA will be amplified and sequenced. Second, stable TAA cell lines will be injected into rabbits to generate tumor antigen-specific antibodies. In fact, we have used 240E cells transfected with human genes to generate rabbit monoclonal Abs against numerous cell surface receptors.

Example 2 Generation of High Titer Rabbit Antiserum Against HeLa Cells and Normal Fibroblast Cells

HeLa-S3 cells and normal human dermal fibroblast cells (NHDF) were grown to 80% confluency. Cells were detached using a non-enzymatic method (5 mM EDTA in culture medium) to minimize damage to membrane proteins. Cells were washed in PBS before s.c. injection into rabbits at 10⁷ cells/rabbit. Three rabbits were immunized for each cell type, in triplicate. Due to the death of one NHDF rabbit, we used antisera from the two rabbits that produced the highest titers in each group. Six sequential immunizations were carried out on a weekly basis. Titers were monitored using the cell-ELISA assay. Briefly, HeLa-S3 cells and fibroblasts were fixed in 96 well plates. Antisera from the immunized rabbits were serially diluted and applied to the fixed cells. Binding reactions were quantitated using peroxidase-conjugated anti-rabbit antibodies and the chromogenic substrate diaminobenzidine (DAB), and confirmed using alkaline phosphatase-conjugated anti-rabbit antibodies followed by PNPP substrate. The titer of the antisera reached 1:50,000 for NHDF and 1:100,000 for HeLa.
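One common way to read a titer from a serial dilution series is to take the highest dilution at which the signal still exceeds a cutoff. The sketch below assumes that convention; the cutoff of twice the background absorbance, the function name, and the data layout are assumptions for illustration and are not stated in this example.

```python
from typing import Iterable, Optional, Tuple


def endpoint_titer(readings: Iterable[Tuple[int, float]],
                   background: float,
                   cutoff_factor: float = 2.0) -> Optional[int]:
    """Return the highest dilution factor whose absorbance exceeds the cutoff.

    `readings` is an iterable of (dilution_factor, absorbance) pairs, where a
    dilution of 1:1000 is given as 1000. The cutoff is assumed here to be
    `cutoff_factor` times the background absorbance. Returns None when no
    dilution is positive.
    """
    cutoff = background * cutoff_factor
    positive = [dilution for dilution, absorbance in readings if absorbance > cutoff]
    return max(positive) if positive else None
```

With purely hypothetical readings `[(1000, 1.8), (10000, 0.9), (100000, 0.25), (1000000, 0.08)]` and a background of 0.1, the cutoff is 0.2 and the function returns `100000`, i.e. a titer of 1:100,000.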

Example 3 Purification of Polyclonal Antibodies From Antisera of Whole Cell Immunized Rabbits

IgG from rabbits immunized with HeLa-S3 or NHDF cells was purified from the antisera with ImmunoPure Immobilized Protein G (Pierce # 20398). The elution of bound proteins was monitored by absorbance at 280 nm. The eluted immunoglobulin fractions were desalted with Pierce Desalting columns. Sample emergence was similarly monitored by measuring the absorbance of each fraction at 280 nm. 15 mg of IgG was purified and pooled from 3 different bleeds of antisera from duplicate rabbits. 17 mg of IgG was similarly obtained from the two NHDF-immunized rabbits.

Example 4 Differential Labeling of the Purified IgG With Fluorescent Dyes

Anti-HeLa-S3 IgG was labeled with PE (phycoerythrin, MW 240,000). PE is a member of the phycobiliprotein family isolated from marine algae. The excitation and emission wavelengths of PE-labeled proteins are approximately 488 nm and 578 nm, respectively. PE was coupled to purified anti-HeLa antibodies as follows. Two heterobifunctional reagents, succinimidyl 3-(2-pyridyldithio)propionate (SPDP) and succinimidyl trans-4-(maleimidylmethyl)cyclohexane-1-carboxylate (SMCC), were used. First, the lysine residues of PE were converted to a pyridyldisulfide derivative with SPDP, then reduced to thiol-PE with the reducing agent tris-(2-carboxyethyl)phosphine (TCEP). Second, lysines of the antibody were converted to thiol-reactive maleimides with SMCC. Thiolated PE (PE-SH) and Ab-maleimide were mixed and crosslinked to each other through the formation of a stable thioether bond. The reaction was terminated by the addition of a 20-fold molar excess of N-ethylmaleimide (NEM) to “cap” the remaining free thiols.

Anti-NHDF IgG was labeled with FITC (fluorescein isothiocyanate, MW 389). FITC reacts with the primary amines of proteins to form dye-protein conjugates. The excitation and emission wavelengths of FITC-labeled proteins are approximately 494 nm and 520 nm, respectively. We found that a ratio of 50:1 (FITC:IgG) is optimal for FITC labeling.
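As a worked illustration of the 50:1 ratio, the sketch below converts a mass of IgG into the mass of FITC needed. It assumes that the ratio is a molar ratio and that IgG has a molecular weight of about 150,000, neither of which is stated in the text; the function name `fitc_mass_ug` is hypothetical.

```python
FITC_MW = 389.0       # g/mol, as given in the text
IGG_MW = 150_000.0    # g/mol, ASSUMED typical value for IgG (not from the text)


def fitc_mass_ug(igg_mg, molar_ratio=50.0):
    """Mass of FITC (in micrograms) for a given mass of IgG (in milligrams).

    Assumes the FITC:IgG ratio is expressed in moles of dye per mole of IgG.
    """
    igg_mol = (igg_mg * 1e-3) / IGG_MW      # mg -> g -> mol
    fitc_mol = igg_mol * molar_ratio
    return fitc_mol * FITC_MW * 1e6         # mol -> g -> micrograms


# About 130 micrograms of FITC per milligram of IgG under these assumptions.
print(f"{fitc_mass_ug(1.0):.1f} ug FITC per 1 mg IgG")
```

If the 50:1 figure were instead a mass ratio, the amounts would differ by several orders of magnitude, so the basis of the ratio should be confirmed before scaling a labeling reaction.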

Example 5 Production of Retroviruses Expressing HeLa cDNA Library, and Development of a Highly Efficient Protocol to Infect Rabbit 240E Cells

A HeLa cell retroviral cDNA library was transformed into XL10-Gold E. coli cells and was then titered using serial dilution. To make library DNA, we then plated the XL10-Gold cells at a density of 1×10⁴ in sixty 100 mm plates. Cells were grown at 37° C. overnight. 5 ml of LB medium was added per plate, and the bacterial library was gently suspended using a cell scraper. The suspension was collected, pooled and incubated at 37° C. for no longer than 1 hr with constant shaking. HeLa expression library DNA was extracted with a QIAfilter plasmid kit. A total of 1.8 mg of plasmid library DNA was obtained. To generate retroviral particles, 293T cells were triple-transfected with a pFB XR plasmid cDNA library, made with a replication-defective vector, and with two additional packaging vectors, pVPack-GP (gag-pol-expressing vector encoding internal structural proteins and reverse transcriptase) and pVPack-VSV-G (env-expressing vector encoding the viral envelope protein). Viral supernatant was collected 48-72 hrs post-transfection.

To determine viral infection rates in rabbit 240E cells, we infected 240E cells with pFB-Neo-LacZ, a plasmid that is similar to the library plasmid. Using standard retroviral infection methods, we found the titer of pFB-Neo-LacZ infection of 240E rabbit cells was 1-2%. However, we found that centrifugation of mammalian cells at 2400 rpm, 18° C. for 3 hrs in the presence of viral particles dramatically increased viral infection of all three cell lines up to 100 fold (to 70% transduced cells). Using beta-Gal as a reporter gene, we showed by in situ staining that 70% of 240E cells can be infected.

Example 6 Identification of “HeLa-Specific” cDNA-Expressing 240E Cells by FACS, Using Differentially Labeled Polyclonal Antibody Probes

As mentioned above, the infected 240E cells that express HeLa cell surface antigens will be preferentially bound by the anti-HeLa polyclonal antibodies. Since the anti-NHDF antibodies should not recognize TAAs, we expect that the infected 240E cells expressing tumor-specific antigens will preferentially bind the anti-HeLa antibodies in a competition with both anti-HeLa and anti-NHDF polyclonal antibodies. Normal human surface antigens should be recognized by both the anti-HeLa and the anti-NHDF polyclonal preparations. 240E cells that do not express human proteins bind neither of the two polyclonal antibodies (see below).

240E cells, untreated or infected with the HeLa cell retroviral cDNA library (7×10⁵ cells/sample), were incubated in 100 μl of blocking buffer (1×PBS plus 2% FBS) at room temperature for 1 hr. Cells were stained with 4 μg of PE-conjugated anti-HeLa IgG and 4 μg of FITC-conjugated anti-NHDF IgG, individually and in combination, and incubated on ice for 20-30 mins. In a parallel experiment, we performed an immuno-depletion step to remove potential non-specific antibodies that bind to rabbit 240E cells (the resulting preparation is referred to as 240E-subtracted antibody). For this step, antibodies were incubated with 7×10⁵ 240E cells in 100 μl of blocking buffer on ice for 30 mins. The 240E cells were then removed by centrifugation, and the remaining IgG in the supernatants was used to stain untreated cells or cells infected with the HeLa cell retroviral cDNA library. After the binding reaction, antibody-stained control and library-infected 240E cells were centrifuged at 1500 rpm for 5 min. After 2 washes with blocking buffer, cells were resuspended in 400 μl of blocking buffer and kept at 4° C. in the dark before FACS analysis.

Example 6 Staining of 240E Cells and FACS Analysis

First, anti-HeLa (PE) and anti-NHDF (FITC) antibody probes were used individually to stain library-infected and non-infected 240E cells. FIG. 3 shows 240E cells of the non-infected control (black line with fill) and HeLa library-infected cells (line, no fill) immunostained with the PE-labeled polyclonal anti-HeLa antibody probe (a and b) or with the FITC-labeled polyclonal anti-NHDF antibody probe (c and d). The antibody probes were used either without pre-absorption (immuno-depletion) with 240E cells (a and c) or with immuno-depletion to decrease 240E background binding (b and d). Cells were subjected to flow cytometry on a FACSCalibur machine. As shown in FIG. 3 a, an apparent shift in cell population was observed when library-infected 240E cells were compared to non-infected cells, both stained with anti-HeLa (PE) probes. With an arbitrary gating M1 and scoring 10,000 events for each cell population, 39.33% of control cells fall into the gate, whereas 65.77% of library-infected cells fall into the same gate. The difference (15%) reflects the shift observed with this gating. Similar results were obtained, with a slight improvement (18% difference), when 240E-subtracted antibody was used in the assay (FIG. 3 b). This is expected, as IgG molecules that cross-react with 240E cells were removed before the binding reaction. In contrast, when anti-NHDF (FITC) probes were used (FIG. 3 c), there was only a very slight difference (less than 2%) between library-infected 240E cells and non-infected cells. This reflects the fact that anti-NHDF antibodies recognize only a negligible fraction of HeLa surface proteins displayed on 240E cells in this cytometry setting. It is likely that when a higher number of library-infected cells is analyzed, the anti-NHDF antibodies will detect more HeLa proteins that share common epitopes (but the fraction will be the same). Similar results were obtained using the 240E-subtracted antibody probe (FIG. 3 d).

Second, anti-HeLa (PE) and anti-NHDF (FITC) antibody probes were used simultaneously to stain library-infected or non-infected 240E cells (FIG. 4). FIG. 4 shows 240E cells of the non-infected control (left panel) and HeLa library-infected cells (right panel) immunostained simultaneously with the PE-labeled polyclonal anti-HeLa antibody probe and the FITC-labeled polyclonal anti-NHDF antibody probe. The antibody probes were used without pre-absorption (immuno-depletion) with 240E cells. The antibody probes with immuno-depletion showed the same results (not shown). An equal amount (4 μg) of each type of antibody probe was added to the cells. 1) FACS analysis showed that the majority of cells are in the lower left quadrant, indicating that the majority of the library-infected cells and control cells do not express surface antigens that are bound by the antibody probes. 2) Noticeably, cell numbers in the PE+/FITC− population (upper left quadrant) increased 4 fold, from 0.19% for control 240E cells to 0.75% for library-infected 240E cells. This indicates that this method distinguishes the cells that express HeLa-specific cell surface genes, which are bound by anti-HeLa but not by anti-NHDF antibodies. 3) There is also a 3 fold increase in the PE+/FITC+ population, from 0.04% for control cells to 0.13% for library-infected cells. This indicates the existence of common antigens that are recognized by antibodies for both HeLa and NHDF cell lines. 4) In agreement with the single probe binding assay, few cells were detected in the PE−/FITC+ quadrant. Anti-NHDF (FITC) antibody probes were shown in the same experiment to be able to stain NHDF cells (not shown).
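The quadrant analysis described above can be sketched in a few lines. This is only an illustration of two-color quadrant gating, not the instrument software used in the experiments: the function names, the thresholds and the toy event list are hypothetical, and only the four percentages passed to `fold_change` come from the text.

```python
def quadrant_percentages(events, pe_threshold, fitc_threshold):
    """Classify (PE, FITC) intensity pairs into four quadrants.

    Returns the percentage of events per quadrant. Thresholds are
    user-chosen gates (hypothetical here).
    """
    if not events:
        raise ValueError("no events to gate")
    counts = {"PE+/FITC-": 0, "PE+/FITC+": 0, "PE-/FITC-": 0, "PE-/FITC+": 0}
    for pe, fitc in events:
        pe_label = "PE+" if pe > pe_threshold else "PE-"
        fitc_label = "FITC+" if fitc > fitc_threshold else "FITC-"
        counts[f"{pe_label}/{fitc_label}"] += 1
    total = len(events)
    return {name: 100.0 * n / total for name, n in counts.items()}


def fold_change(control_pct, infected_pct):
    """Ratio of the library-infected percentage to the control percentage."""
    return infected_pct / control_pct


# Toy events (made up) to show the gating step.
toy_events = [(10.0, 1.0), (10.0, 10.0), (1.0, 1.0), (1.0, 10.0)]
print(quadrant_percentages(toy_events, pe_threshold=5.0, fitc_threshold=5.0))

# Percentages reported in the text for the two quadrants of interest.
print(f"PE+/FITC- fold change: {fold_change(0.19, 0.75):.2f}")
print(f"PE+/FITC+ fold change: {fold_change(0.04, 0.13):.2f}")
```

The ratios computed from the reported percentages are consistent with the approximately 4 fold and 3 fold increases stated above.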

Our experiments demonstrated that identification of rabbit cells expressing tumor-specific surface markers is possible by using competitive immuno-staining with differentially labeled antibody probes. Our experiments showed that the rabbit 240E cell line is a valuable system for this technology. First, the cell line can be effectively infected using a retroviral expression library. Second, the cell line has very low background when an anti-human cell polyclonal antibody is used for staining. In fact, immunodepletion is not needed to remove antibodies that bind to rabbit cells. Third, the stable cell lines that are derived after FACS sorting can be used to immunize rabbits directly in order to generate rabMAbs against the stably expressed human cDNA. By defining an appropriate threshold (gate), a manageable number of PE+/FITC− cells can be sorted and studied.

Example 7 Confirmation of the Expression of HeLa Proteins on 240E Clones and PCR Characterization of the Integrated Genes

PCR and Cloning of the HeLa Gene Integrated in 240E Cells

Six clones were grown out of the single cells deposited in three 96-well plates. (Note: We have since optimized the conditions for single cell growth, and expect a higher efficiency in the future.) Three clones (named Clone 2, 3 and 5) were further expanded in 24-well and then in 6-well plates, while the other 3 clones remain at the 96-well stage. DNA of these clones was extracted using a QIAGEN genomic DNA purification kit. cDNA inserts were amplified using the TaqPlus Precision PCR system (Stratagene), with 5′-Retro (5′-GGCTGCCGACCCCGGGGGTGG-3′ (SEQ ID NO:1)) and 3′-pFB (5′-CGAACCCCAGAGTCCCGCTCA-3′ (SEQ ID NO:2)) as primers. Briefly, 2.5 μl 10× TaqPlus Precision buffer, 0.25 μl dNTP mix (25 mM of each nucleotide), 0.5 μl 5′-Retro primer (100 ng/ml), 0.5 μl 3′-pFB primer (100 ng/ml), and 0.5 μl TaqPlus Precision polymerase mixture were mixed and subjected to PCR using an MJ Research PTC-200 Thermal Cycler. The PCR program was as follows:

1 cycle, 95° C. for 1 min; 40 cycles, 95° C. for 1 min, 64° C. for 1 min, 72° C. for 5 min; 1 cycle, 72° C. for 10 min.
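For planning purposes, the cycling program above can be written as a small data structure and its nominal hold time summed. This is a sketch only: the representation and the name `total_minutes` are hypothetical, ramp times between temperatures are ignored, and the 25 μl final reaction volume is an assumption inferred from the use of 2.5 μl of 10× buffer (the text does not state the final volume or the template and water volumes).

```python
# Each entry: (number of cycles, [(temperature in deg C, hold time in minutes), ...])
pcr_program = [
    (1, [(95, 1)]),                       # initial denaturation
    (40, [(95, 1), (64, 1), (72, 5)]),    # denature, anneal, extend
    (1, [(72, 10)]),                      # final extension
]


def total_minutes(program):
    """Sum of hold times over all cycles, ignoring ramp times."""
    return sum(cycles * sum(minutes for _, minutes in steps)
               for cycles, steps in program)


run_minutes = total_minutes(pcr_program)
print(f"Nominal run time: {run_minutes} min ({run_minutes // 60} h {run_minutes % 60} min)")

# Listed reagent volumes in microliters (template and water are not given in the text).
listed_ul = {"10x buffer": 2.5, "dNTP mix": 0.25, "5'-Retro primer": 0.5,
             "3'-pFB primer": 0.5, "polymerase": 0.5}
ASSUMED_FINAL_UL = 25.0                   # ASSUMPTION: 2.5 ul of 10x buffer implies 1x in 25 ul
listed_total_ul = sum(listed_ul.values())
dntp_final_mm = 25.0 * listed_ul["dNTP mix"] / ASSUMED_FINAL_UL   # mM of each nucleotide
print(f"Listed reagents: {listed_total_ul} ul; dNTP final: {dntp_final_mm} mM each")
```

Under these assumptions the program holds for 291 minutes, and the listed reagents account for 4.25 μl of the assumed 25 μl reaction.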

As shown in FIG. 5, Clone 2 showed a single specific band, while some non-specific bands were observed for Clone 3. No PCR amplification was seen for Clone 5. TA cloning and sequencing of the PCR amplified product are in progress. Similar PCR amplification will be conducted for the rest of the clones.

FACS Analysis of the 240E Stable Clones

To confirm the expression of HeLa-specific membrane proteins on the 240E stable cell lines, FACS analysis was used to analyze each of the three clones that had been analyzed by PCR. In addition, Clone 6, which had not yet been examined by PCR, was also included in the FACS experiment. Anti-HeLa polyclonal antibody and anti-NHDF polyclonal antibody (Part 2) were used. The result of the FACS analysis is shown in FIG. 6. In agreement with the PCR result, Clone 2 showed specific binding by anti-HeLa IgG, but not anti-NHDF IgG. Clones 3 and 5 did not show apparent IgG binding. We also found that Clone 6 expresses high levels of HeLa-specific membrane protein (data not shown). The integrated DNA sequence will be determined for all of the positive clones. It can be concluded that DISC is an efficient technique to clone genes of cell-specific membrane proteins.

In certain embodiments of the invention, the complex probe is a plurality of artificial binding proteins, e.g., protein domains such as fibronectin domains can be randomized and used as a library of binding proteins to a target.

In another embodiment of the invention, the complex probe is nucleic acid molecules such as aptamers. It is known that nucleic acid molecules can adopt different conformations and bind to proteins with certain affinity.

It is evident from the above results and discussion that the subject invention provides an important new means for identifying an antigen that is differentially expressed between two cell types, or fractions thereof. As such, the subject methods and systems find use in a variety of different applications, including research, therapeutic and other applications. Accordingly, the present invention represents a significant contribution to the art.

While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.

1. A method of identifying an antigen that is present in first and second samples in different amounts, comprising, a) immunizing a first and second rabbit with a first and a second sample of human cells or a fraction thereof to generate a first and a second population of rabbit antibodies, respectively; b) distinguishably labeling said first and second population of antibodies, c) contacting said first and second populations of labeled antibodies with a plurality of non-human mammalian cells producing a library of human proteins on their cell surfaces; d) sorting a non-human mammalian cell that is differentially bound by said first and second populations of antibodies, and e) identifying a human protein on a surface of said non-human mammalian cell, wherein said human protein is an antigen that is present in said first and second samples in differing amounts.
 2. The method of claim 1, wherein said sorting is based on a ratio of levels of binding of said first and second population of antibodies to said non-human mammalian cell, wherein said first and second population of antibody independently provide a signal, and the ratio of the two signals indicates said non-human mammalian cell that is differentially bound.
 3. The method of claim 1, wherein said first sample is an abnormal cell and said second sample is a normal cell.
 4. A method of identifying a differentially expressed protein, comprising, a) distinguishably labeling a first and a second population of polyclonal antibodies that are reactive against a cancerous human cell and a non-cancerous human cell, respectively; b) contacting said first and second populations of labeled antibodies with a plurality of non-human mammalian cells producing human proteins; and c) identifying a non-human mammalian cell producing a protein that is differentially bound by said first and second populations of antibodies, wherein said protein is differentially expressed.
 5. The method of claim 4, wherein said human proteins are produced on a surface of said non-human mammalian cells.
 6. The method of claim 5, wherein said non-human mammalian cell is a rabbit cell.
 7. The method of claim 4, wherein said plurality of non-human mammalian cells express a library of recombinant human proteins.